CS 599 : Structure and Dynamics of Networked

Author

  • Ashish Vaswani
Abstract

So far, we have mostly talked about communities in the sense of discovering one, or a few, densely linked subgraphs. We departed from this interpretation at the end of last lecture, when we defined the modularity of a clustering. There, we were interested in dividing the nodes of a graph into disjoint clusters, and in the quality of that clustering. Clustering of data, mostly in metric spaces, is among the best-studied problems in CS, largely because of its applications to machine learning and classification.

Here, we look at the relatively new concept of correlation clustering [1]. In correlation clustering, each edge of the graph is annotated with a label of ‘+’ or ‘-’, expressing that its two endpoints were observed to be similar or dissimilar, respectively. The goal is to find a clustering that puts many ‘+’ edges inside clusters and many ‘-’ edges between clusters. These two goals may conflict, as can be seen for a triangle with two edges labeled ‘+’ and one labeled ‘-’. Notice that the number of clusters is not pre-specified in this problem.

This notion of clustering is useful when we can tell whether a link constitutes an endorsement or the opposite. In many competitive settings (politics or sports, for instance), pages link to other pages with the explicit goal of deriding their content. This can frequently be identified from anchor text and similar clues. In this sense, correlation clustering may help us identify communities with aligned interests that compete with other communities. More formally, given the graph G = (V, E) on n vertices, we write +(i, j) if the edge between i and j is labeled ‘+’, and −(i, j) if it is labeled ‘-’. The optimization problem can now be expressed in two ways:
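To make the triangle conflict concrete, here is a minimal Python sketch. It assumes the disagreement count as the measure of quality, that is, the number of ‘+’ edges across clusters plus the number of ‘-’ edges inside clusters, as named in the related lecture summary on this page. The function names `disagreements` and `all_clusterings`, the dictionary encoding of edge labels, and the brute-force enumeration are illustrative assumptions, not the method of the lecture, and the enumeration is only feasible for very small graphs.

```python
def disagreements(labels, cluster_of):
    """Count '+' edges across clusters plus '-' edges inside a cluster.

    labels: dict mapping an edge (i, j) to '+' or '-' (hypothetical encoding).
    cluster_of: dict mapping each node to its cluster id.
    """
    bad = 0
    for (i, j), sign in labels.items():
        same = cluster_of[i] == cluster_of[j]
        if (sign == '+' and not same) or (sign == '-' and same):
            bad += 1
    return bad


def all_clusterings(nodes):
    """Yield every partition of nodes, each as a dict node -> cluster id.

    The number of clusters is not fixed in advance: each node either joins
    an existing cluster or opens a new one.
    """
    nodes = list(nodes)

    def extend(k, assignment, used):
        if k == len(nodes):
            yield dict(zip(nodes, assignment))
            return
        for c in range(used + 1):  # clusters 0..used-1 exist; 'used' is new
            yield from extend(k + 1, assignment + [c], max(used, c + 1))

    yield from extend(0, [], 0)


# Triangle with two '+' edges and one '-' edge.
triangle = {(0, 1): '+', (1, 2): '+', (0, 2): '-'}
best = min(all_clusterings(range(3)), key=lambda c: disagreements(triangle, c))
best_cost = disagreements(triangle, best)
```

On this triangle no clustering reaches zero disagreements: putting all three nodes together violates the ‘-’ edge, while any split cuts at least one ‘+’ edge, so `best_cost` is 1.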

Similar resources

CS 599 : Structure and Dynamics of Networked Information ( Spring 2005 ) 01 / 24 / 2005 : The Efficacy of Collusions in Web Ranking and the Counter - measures

At Google, the rank depends on the product of relevance and importance. It is, however, difficult to say which of the two is a bigger factor in deciding the rank. An SEO can try to manipulate both of these factors in trying to boost the rank of the page. In this lecture, we’re concerned only with efforts to try and increase the importance of the page by manipulating the link structure around th...

CS 599 : Structure and Dynamics of Networked Information ( Spring 2005 ) 03 / 07 / 2005 : Power Law Degree Distributions

The existence of power law distributions (also known as heavy-tailed distributions) in various natural and man-made scenarios has been demonstrated empirically over the years [6], and attracted a great deal of interest, resulting in models that would naturally predict such distributions. The areas in which power laws have been observed are very diverse, as evidenced by the following, not nearly...

CS 599 : Structure and Dynamics of Networked Information ( Spring 2005 )

In the previous lecture, we looked at and analyzed the HITS algorithm, based on Hubs and Authorities. This algorithm was designed with a scenario in mind where authorities might not cite each other (for example, Ford will be very unlikely to link to Chevrolet). Hence, HITS uses the idea of conferring authority indirectly through hubs. In scenarios where authorities do cite each other, such as a...

CS 599 : Structure and Dynamics of Networked Information ( Spring 2005 ) 4 / 4 / 2005 : Epidemic Phenomena

One of the first models studied explicitly in this context was Schelling’s model for segregation [3]. Schelling was motivated by the question: why is it that most neighborhoods are very uniform (racially, and in other respects), even though most people profess that they would prefer to live in a diverse community? Schelling proposed the following model: Assume that roughly n^2/2 individuals liv...

CS 599 : Structure and Dynamics of Networked

During the last lecture, we started the analysis of an LP-rounding based approximation algorithm for minimizing disagreements in correlation clustering. Recall that in correlation clustering [1], we are given a graph each of whose edges is labeled either ‘+’ or ‘-’. The goal is to partition nodes into clusters so as to minimize the number of ‘+’ edges across clusters plus the number of ‘-’ edge...

Molecular Dynamics Simulations on Polymeric Nanocomposite Membranes Designed to Deliver Pipobromane Anticancer Drug

Three chitosan (CS), polyethylene glycol (PEG) and polylactic acid (PLA) nanocomposite systems containing SiO2 nanoparticles and water molecules were designed by molecular dynamics (MD) simulations to deliver pipobromane (PIP) anticancer drug in order to discover the most appropriate drug delivery system (DDS) in aqueous medium which was analogous to the human body. The density for the CS matri...


Publication date: 2005